Lecture 1: Introduction to data analysis in R
First, import some data into a data frame using the read.csv command.
In [1]:
# Set the size of the figures. Adapt the values so that the plots display nicely in your browser.
options(repr.plot.width=10, repr.plot.height=6.5)
In [2]:
mydatastructure = read.csv("YieldStrengthData.csv") # read csv file
mydata = mydatastructure$YieldStrength
In [3]:
mydatastructure
| YieldStrength |
|---|
| <dbl> |
| 12.0 |
| 11.8 |
| 14.2 |
| 10.5 |
| 12.3 |
| 15.4 |
| 9.8 |
| 11.1 |
| 13.4 |
| 12.5 |
| 11.4 |
| 10.8 |
| 1.2 |
| 14.8 |
| 15.0 |
| 13.5 |
| 14.1 |
| 11.2 |
| 11.2 |
| 13.5 |
| 12.3 |
| 12.1 |
| 11.6 |
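Before plotting, it can help to take a quick numerical look at the data. The following is a minimal sketch, not part of the original notebook: the vector `yield` is a name introduced here for illustration, and its values are copied by hand from the table above so that the snippet runs without the CSV file.

```r
# Values copied from the table above (assumption: same order as in the CSV file).
yield <- c(12.0, 11.8, 14.2, 10.5, 12.3, 15.4, 9.8, 11.1, 13.4, 12.5, 11.4, 10.8,
           1.2, 14.8, 15.0, 13.5, 14.1, 11.2, 11.2, 13.5, 12.3, 12.1, 11.6)

n_obs <- length(yield)        # number of observations
smallest <- which.min(yield)  # position of the smallest value

print(n_obs)
print(summary(yield))         # minimum, quartiles, mean, maximum
print(yield[smallest])        # the smallest value itself
```

The smallest value, 1.2, lies well apart from the other observations, which is worth keeping in mind when reading the plots that follow.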
In [4]:
plot(mydata,rep(0,length(mydata)), # plot(x,y); rep(0,n) repeats 0 n times, so all points lie on one horizontal line.
xlab="Yield Strength (MPa)",
ylab=" ",
col="blue"
)
Histograms
In [5]:
hist(mydata,
breaks = seq(floor(min(mydata)),ceiling(max(mydata)),by=1),
plot=TRUE,
axes=TRUE,
xlab="Yield Strength (MPa)",
col="orange",
freq=TRUE)
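To see what the histogram is counting, the same call can be made with plot=FALSE, which returns the bin counts instead of drawing them. This is a small sketch on a hypothetical vector `x` (made-up values, not the yield strength data), using the same breaks construction as above.

```r
# Hypothetical example values, chosen only for illustration.
x <- c(1.2, 2.5, 2.7, 3.1, 4.9)

# Same break construction as in the cell above: one bin per unit.
brk <- seq(floor(min(x)), ceiling(max(x)), by = 1)

# plot = FALSE returns the histogram object without drawing it.
h <- hist(x, breaks = brk, plot = FALSE)

print(h$breaks)  # bin edges
print(h$counts)  # number of observations per bin
```

By default the bins are closed on the right, so a value equal to a break falls into the bin to its left.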
Empirical Cumulative Distribution Function (ECDF)
In [6]:
plot(ecdf(mydata))
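The ECDF at a point t is the fraction of observations that are less than or equal to t. The sketch below checks this by hand against the function returned by ecdf; the vector `x` and the threshold `t0` are hypothetical values chosen for illustration.

```r
# Hypothetical example values, chosen only for illustration.
x <- c(3, 1, 4, 1, 5)
t0 <- 3

Fn <- ecdf(x)              # ecdf() returns a function
built_in <- Fn(t0)         # evaluate the ECDF at t0
by_hand <- mean(x <= t0)   # fraction of observations <= t0

print(built_in)
print(by_hand)
```

Both values agree, because three of the five observations are at most 3.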
Mean vs. Median
In [7]:
plot(mydata,rep(0,length(mydata)), # plot(x,y); rep(0,n) repeats 0 n times, so all points lie on one horizontal line.
xlab="Yield Strength (MPa)",
ylab=" ",
col="blue"
)
abline(v=mean(mydata), col="red",lwd=3)
mtext(paste("mean=", signif(mean(mydata),4)),col="red")
abline(v=median(mydata), col="blue",lwd=3)
mtext(paste("median=", signif(median(mydata),4)),col="blue",adj = 0)
In [8]:
# some random data
mydata <- rnorm(n = 1000,mean = 13,sd = 9)
In [9]:
# bimodal data: a main cluster plus a small, distant cluster of outliers
mydata <- c(rnorm(50,10,3),rnorm(5,300,0.3))
In [10]:
# asymmetric data
mydata <- c(rnorm(300,20,5),rnorm(150,40,10))
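The generated data sets are useful for seeing how differently the mean and the median react to outliers. The following sketch repeats the data set with the distant cluster and compares the two statistics numerically. The call to set.seed and the variable names are additions made here so that the random numbers are reproducible; they are not part of the original notebook.

```r
set.seed(1)  # assumption: any fixed seed works, it only makes the run repeatable

# Main cluster around 10, plus five values around 300.
x_out <- c(rnorm(50, 10, 3), rnorm(5, 300, 0.3))

m_mean <- mean(x_out)      # pulled strongly towards the distant cluster
m_median <- median(x_out)  # stays inside the main cluster

print(m_mean)
print(m_median)
```

Only 5 of the 55 values are extreme, yet the mean moves far away from the bulk of the data, while the median barely changes.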
Quantiles
In [11]:
quantile(mydata, probs = c(0,0.1,0.3,0.4,0.5,0.75,1))
| Quantile | Value |
|---|---|
| 0% | 6.12653045617659 |
| 10% | 14.1905778794507 |
| 30% | 18.9941605578449 |
| 40% | 21.0653916427957 |
| 50% | 23.5398728734673 |
| 75% | 32.2687110060274 |
| 100% | 67.4131892810316 |
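To make the quantile call more concrete, here is a small sketch on a hypothetical vector `x` where the results can be checked by hand. With its default settings, quantile interpolates linearly between neighbouring sorted values, and the 50% quantile coincides with the median.

```r
# Hypothetical example values, chosen only for illustration.
x <- c(10, 20, 30, 40, 50)

q <- quantile(x, probs = c(0.1, 0.25, 0.5))
print(q)

# The 50% quantile is the median.
print(median(x))
```

For the 10% quantile, the position falls 40% of the way between the first and second sorted values, which gives 10 + 0.4 * (20 - 10) = 14.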
Our first Boxplot
In [12]:
boxplot(mydata)
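The numbers behind a boxplot can be inspected with boxplot.stats, which returns the whisker ends, the hinges (approximately the quartiles) and the median, together with the points drawn individually as outliers. The sketch below uses a hypothetical vector `x` with one obvious outlier; with the default coefficient of 1.5, points more than 1.5 times the box height away from the box are flagged.

```r
# Hypothetical example values, chosen only for illustration.
x <- c(10, 11, 12, 13, 14, 300)

b <- boxplot.stats(x)

print(b$stats)  # lower whisker, lower hinge, median, upper hinge, upper whisker
print(b$out)    # points flagged as outliers
```

Here the box spans 11 to 14, so anything above 14 + 1.5 * 3 = 18.5 is flagged, and the upper whisker ends at 14, the largest value that is not an outlier.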